[SDK] Implement the ProbabilitySampler#4135
Conversation
7a9d7a0 to
44a42f3
Compare
|
here is a demo against this branch to show the behaviour end to end. it shows the threshold boundary at R >= T, the th encoding, rv taking precedence over the trace id, th replacement, and a foreign tracestate entry surviving untouched. output: demo.cc#include <cstdint>
#include <cstdio>
#include <map>
#include <string>
#include <utility>
#include <vector>
#include "opentelemetry/common/key_value_iterable_view.h"
#include "opentelemetry/sdk/trace/samplers/probability.h"
#include "opentelemetry/trace/span_context.h"
#include "opentelemetry/trace/span_context_kv_iterable_view.h"
#include "opentelemetry/trace/trace_state.h"
#include "src/common/random.h"
namespace trace_api = opentelemetry::trace;
using opentelemetry::sdk::common::Random;
using opentelemetry::sdk::trace::Decision;
using opentelemetry::sdk::trace::ProbabilitySampler;
namespace {
trace_api::TraceId TraceIdWithLow56(uint64_t v) {
uint8_t buf[16] = {0};
for (int i = 0; i < 7; ++i)
buf[15 - i] = static_cast<uint8_t>(v >> (8 * i));
return trace_api::TraceId(buf);
}
opentelemetry::sdk::trace::SamplingResult
Sample(ProbabilitySampler &s, const trace_api::SpanContext &ctx,
trace_api::TraceId tid) {
std::map<std::string, int> attrs;
std::vector<
std::pair<trace_api::SpanContext, std::map<std::string, std::string>>>
links;
opentelemetry::common::KeyValueIterableView<decltype(attrs)> attrs_view{
attrs};
trace_api::SpanContextKeyValueIterableView<decltype(links)> links_view{links};
return s.ShouldSample(ctx, tid, "demo", trace_api::SpanKind::kInternal,
attrs_view, links_view);
}
const char *Header(const opentelemetry::sdk::trace::SamplingResult &r) {
static std::string h;
h = r.trace_state ? r.trace_state->ToHeader() : "(parent tracestate kept)";
return h.c_str();
}
} // namespace
int main() {
// ratio 1.0: always sampled, threshold 0 emitted as ot=th:0
ProbabilitySampler all(1.0);
auto r =
Sample(all, trace_api::SpanContext::GetInvalid(), TraceIdWithLow56(0));
printf("ratio 1.0 -> %s, tracestate: %s\n",
r.decision == Decision::RECORD_AND_SAMPLE ? "SAMPLE" : "DROP",
Header(r));
// ratio 0.5: threshold is 2^55. R = 2^55 samples, R = 2^55-1 drops.
ProbabilitySampler half(0.5);
r = Sample(half, trace_api::SpanContext::GetInvalid(),
TraceIdWithLow56(0x80000000000000));
printf("ratio 0.5, R=0x80000000000000 -> %s, tracestate: %s\n",
r.decision == Decision::RECORD_AND_SAMPLE ? "SAMPLE" : "DROP",
Header(r));
r = Sample(half, trace_api::SpanContext::GetInvalid(),
TraceIdWithLow56(0x7fffffffffffff));
printf("ratio 0.5, R=0x7fffffffffffff -> %s\n",
r.decision == Decision::RECORD_AND_SAMPLE ? "SAMPLE" : "DROP");
// 100000 random trace ids at ratio 0.5
int sampled = 0;
for (int i = 0; i < 100000; ++i) {
uint8_t buf[16];
Random::GenerateRandomBuffer(buf);
if (Sample(half, trace_api::SpanContext::GetInvalid(),
trace_api::TraceId(buf))
.decision == Decision::RECORD_AND_SAMPLE)
++sampled;
}
printf("ratio 0.5, 100000 random trace ids -> sampled %d (%.3f)\n", sampled,
sampled / 100000.0);
// explicit rv in the parent tracestate wins over trace id randomness
uint8_t id[16] = {1};
trace_api::SpanContext rv_high(
trace_api::TraceId(id), trace_api::SpanId(), trace_api::TraceFlags{0},
false, trace_api::TraceState::FromHeader("ot=rv:ffffffffffffff"));
r = Sample(half, rv_high, TraceIdWithLow56(0));
printf("ratio 0.5, rv=ff.. trace id low bits 0 -> %s, tracestate: %s\n",
r.decision == Decision::RECORD_AND_SAMPLE ? "SAMPLE" : "DROP",
Header(r));
// existing th is replaced, other entries survive
ProbabilitySampler quarter(0.25);
trace_api::SpanContext with_th(
trace_api::TraceId(id), trace_api::SpanId(), trace_api::TraceFlags{0},
false, trace_api::TraceState::FromHeader("ot=th:8,congo=t61rcWkgMzE"));
r = Sample(quarter, with_th, TraceIdWithLow56(0xffffffffffffff));
printf("ratio 0.25, parent 'ot=th:8,congo=..' -> %s, tracestate: %s\n",
r.decision == Decision::RECORD_AND_SAMPLE ? "SAMPLE" : "DROP",
Header(r));
return 0;
} |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4135 +/- ##
==========================================
+ Coverage 82.89% 82.98% +0.10%
==========================================
Files 405 406 +1
Lines 17292 17399 +107
==========================================
+ Hits 14332 14437 +105
- Misses 2960 2962 +2
🚀 New features to boost your workflow:
|
44a42f3 to
4685f2b
Compare
|
about last force push-
|
| /** | ||
| * Converts a ratio in [0, 1] to a 56-bit rejection threshold. | ||
| */ | ||
| uint64_t CalculateRejectionThreshold(double ratio) noexcept |
There was a problem hiding this comment.
Can we reuse CalculateThreshold, most of the logic seems same?
There was a problem hiding this comment.
Okay. I'm reusing CalculateThreshold and deleting the duplicate.
| return hex; | ||
| } | ||
|
|
||
| uint64_t RandomnessFromTraceId(const trace_api::TraceId &trace_id) noexcept |
There was a problem hiding this comment.
Okay. Reusing GetRandomnessFromTraceId.
| } | ||
|
|
||
| if (randomness < threshold_) | ||
| return {Decision::DROP, nullptr, {}}; |
There was a problem hiding this comment.
If this sampler drops the span, it returns no updated tracestate, so the child span falls back to the parent tracestate and can keep an old ot=th value. Since this is a new independent probability decision, that threshold is no longer valid. Can you check the logic here.
There was a problem hiding this comment.
Fixed it to strip th on the drop path and return the recomputed tracestate, with rv and the other sub keys preserved. Added a test.
4685f2b to
71a91df
Compare
| { | ||
| if (parent_context.IsValid() && !parent_context.trace_flags().IsRandom()) | ||
| { | ||
| static std::atomic<bool> warned{false}; |
There was a problem hiding this comment.
Non-blocking: probably sample on the warning instead of just showing the first warning.
There was a problem hiding this comment.
Kept it once-per-process to stay off the per-span hot path.
| } | ||
|
|
||
| std::string new_ot_value = SetThresholdSubKey(ot_value, EncodeThreshold(threshold_)); | ||
| if (!trace_api::TraceState::IsValidValue(new_ot_value)) |
There was a problem hiding this comment.
Same stale-th class of issue you fixed for the drop path can still occur on the sample path: when IsValidValue(new_ot_value) is false or the tracestate is already full, we return an empty trace_state, so the child inherits the parent's (possibly stale) ot=th. Should we strip the parent th (reusing the drop-path OtelTraceState logic) before returning RECORD_AND_SAMPLE in those branches?
There was a problem hiding this comment.
Yes. Reused the drop-path strip on the invalid-ot branch. The full branch is !has_ot, so nothing stale there. Added a test.
There was a problem hiding this comment.
Pull request overview
Implements the SDK-side ProbabilitySampler (per the consistent probability sampling spec) and wires it into composable sampler configuration, alongside tests and documentation updates.
Changes:
- Add
ProbabilitySampler/ProbabilitySamplerFactoryimplementation, including tracestate (ot) threshold (th) handling. - Integrate ProbabilitySampler into SDK configuration for composable probability sampling.
- Add comprehensive unit tests plus build-system, docs, and changelog updates.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| sdk/src/trace/samplers/probability.cc | New sampler implementation (threshold calc, rv/th tracestate handling, sampling decision logic). |
| sdk/src/trace/samplers/probability_factory.cc | Factory for creating ProbabilitySampler instances. |
| sdk/include/opentelemetry/sdk/trace/samplers/probability.h | Public sampler header/API. |
| sdk/include/opentelemetry/sdk/trace/samplers/probability_factory.h | Public factory header/API. |
| sdk/src/trace/CMakeLists.txt | Register new sampler sources in the SDK trace library build. |
| sdk/src/configuration/sdk_builder.cc | Switch composable probability sampler config to use ProbabilitySampler. |
| sdk/test/trace/probability_sampler_test.cc | New unit tests covering sampling decisions and tracestate behavior. |
| sdk/test/trace/CMakeLists.txt | Add new test to CMake test list. |
| sdk/test/trace/BUILD | Add new Bazel cc_test target for ProbabilitySampler tests. |
| docs/public/sdk/GettingStarted.rst | Document ProbabilitySampler as an available SDK sampler. |
| CHANGELOG.md | Add changelog entry for the new sampler. |
| auto parent_trace_state = parent_context.trace_state(); | ||
| std::string ot_value; | ||
| bool has_ot = parent_trace_state->Get(kOtTraceStateKey, ot_value); |
| if (model->ratio > 0.0) | ||
| { | ||
| sampler = opentelemetry::sdk::trace::ProbabilitySamplerFactory::Create(model->ratio); | ||
| } | ||
| else | ||
| { | ||
| sampler = opentelemetry::sdk::trace::AlwaysOffSamplerFactory::Create(); | ||
| } |
…tale th on the drop path
Signed-off-by: ayush-that <ayush1337@hotmail.com>
Signed-off-by: ayush-that <ayush1337@hotmail.com>
91e7d59 to
210adcb
Compare
Fixes #4127
ref: https://opentelemetry.io/docs/specs/otel/trace/sdk/#probabilitysampler
Changes
adds the ProbabilitySampler from the spec. it turns the configured ratio into a 56 bit threshold, takes the randomness value from the rv key in the ot tracestate if there is one, otherwise from the last 7 bytes of the trace id, and samples the span when the randomness is greater than or equal to the threshold. sampled spans get the threshold written into the tracestate as the th key. the file configuration for the composable probability sampler now uses this sampler instead of TraceIdRatioBasedSampler.
the threshold keeps full 56 bit precision (same as opentelemetry-java-contrib). composable samplers are a separate effort (#4028), and the deprecated TraceIdRatioBased sampler is not changed.
CHANGELOG.mdupdated for non-trivial changes